GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

Agrawal, Lakshya A, Tan, Shangyin, Soylu, Dilara, Ziems, Noah, Khare, Rishi, Opsahl-Ong, Krista, Singhvi, Arnav, Shandilya, Herumb, Ryan, Michael J, Jiang, Meng, Potts, Christopher, Sen, Koushik, Dimakis, Alexandros G., Stoica, Ion, Klein, Dan, Zaharia, Matei, Khattab, Omar

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly adapted to downstream tasks via reinforcement learning (RL) methods like Group Relative Policy Optimization (GRPO), which often require thousands of rollouts to learn new tasks. We argue that the interpretable nature of language can often provide a much richer learning medium for LLMs, compared with policy gradients derived from sparse, scalar rewards. To test this, we introduce GEPA (Genetic-Pareto), a prompt optimizer that thoroughly incorporates natural language reflection to learn high-level rules from trial and error. Given any AI system containing one or more LLM prompts, GEPA samples system-level trajectories (e.g., reasoning, tool calls, and tool outputs) and reflects on them in natural language to diagnose problems, propose and test prompt updates, and combine complementary lessons from the Pareto frontier of its own attempts. As a result of GEPA's design, it can often turn even just a few rollouts into a large quality gain. Across four tasks, GEPA outperforms GRPO by 10% on average and by up to 20%, while using up to 35x fewer rollouts. GEPA also outperforms the leading prompt optimizer, MIPROv2, by over 10% across two LLMs, and demonstrates promising results as an inference-time search strategy for code optimization.
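The genetic-Pareto selection at the core of GEPA can be illustrated with a minimal sketch (all names here are hypothetical, not from the paper's code; in the real system, `reflect_and_mutate` would prompt an LLM with execution traces to diagnose failures, rather than being a plain function):

```python
import random

def pareto_frontier(candidates):
    """Keep (prompt, scores) pairs whose per-task score vectors are not
    dominated by any other candidate (>= on every task, > on at least one)."""
    frontier = []
    for i, (prompt_i, scores_i) in enumerate(candidates):
        dominated = any(
            all(a >= b for a, b in zip(scores_j, scores_i))
            and any(a > b for a, b in zip(scores_j, scores_i))
            for j, (_, scores_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            frontier.append((prompt_i, scores_i))
    return frontier

def evolve(seed_prompt, evaluate, reflect_and_mutate, budget=20, seed=0):
    """Hypothetical outer loop: sample a parent from the Pareto frontier of
    all attempts so far, mutate it via reflection, and keep every child."""
    rng = random.Random(seed)
    pool = [(seed_prompt, evaluate(seed_prompt))]
    for _ in range(budget):
        parent, _ = rng.choice(pareto_frontier(pool))
        child = reflect_and_mutate(parent)
        pool.append((child, evaluate(child)))
    return max(pool, key=lambda c: sum(c[1]))  # best total-score prompt
```

Keeping the whole Pareto frontier, rather than a single best candidate, is what lets complementary lessons (a prompt strong on one task, another strong on a different task) both survive as parents.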


An End-to-End Human Simulator for Task-Oriented Multimodal Human-Robot Collaboration

Shervedani, Afagh Mehri, Li, Siyu, Monaikul, Natawut, Abbasi, Bahareh, Di Eugenio, Barbara, Zefran, Milos

arXiv.org Artificial Intelligence

This paper proposes a neural network-based user simulator that can provide a multimodal interactive environment for training Reinforcement Learning (RL) agents in collaborative tasks involving multiple modes of communication. The simulator is trained on the existing ELDERLY-AT-HOME corpus and accommodates multiple modalities such as language, pointing gestures, and haptic-ostensive actions. The paper also presents a novel multimodal data augmentation approach, which addresses the challenge of using a limited dataset due to the expensive and time-consuming nature of collecting human demonstrations. Overall, the study highlights the potential for using RL and multimodal user simulators in developing and improving domestic assistive robots.


Multimodal Reinforcement Learning for Robots Collaborating with Humans

Shervedani, Afagh Mehri, Li, Siyu, Monaikul, Natawut, Abbasi, Bahareh, Di Eugenio, Barbara, Zefran, Milos

arXiv.org Artificial Intelligence

Robot assistants for older adults and people with disabilities need to interact with their users in collaborative tasks. The core component of these systems is an interaction manager, whose job is to observe and assess the task and infer the state of the human and their intent in order to choose the best course of action for the robot. Due to the sparseness of the data in this domain, the policy for such multi-modal systems is often crafted by hand; as the complexity of interactions grows, this process does not scale. In this paper, we propose a reinforcement learning (RL) approach to learn the robot policy. In contrast to typical dialog systems, our agent is trained with a simulator developed from human data and can deal with multiple modalities such as language and physical actions. We conducted a human study to evaluate the performance of the system in interaction with users. The system shows promising preliminary results when used by real users.


Evaluating Multimodal Interaction of Robots Assisting Older Adults

Shervedani, Afagh Mehri, Oh, Ki-Hwan, Abbasi, Bahareh, Monaikul, Natawut, Rysbek, Zhanibek, Di Eugenio, Barbara, Zefran, Milos

arXiv.org Artificial Intelligence

We outline our work on evaluating robots that assist older adults by engaging with them through multiple modalities that include physical interaction. Our thesis is that to increase the effectiveness of assistive robots: 1) robots need to understand and effect multimodal actions, 2) robots should not only react to the human but also take the initiative and lead the task when necessary. We start by briefly introducing our proposed framework for multimodal interaction and then describe two different experiments with the actual robots. In the first experiment, a Baxter robot helps a human find and locate an object using the Multimodal Interaction Manager (MIM) framework. In the second experiment, a NAO robot is used in the same task; however, the roles of the robot and the human are reversed. We discuss the evaluation methods that were used in these experiments, including the different metrics employed to characterize the performance of the robot in each case. We conclude by providing our perspective on the challenges and opportunities for the evaluation of assistive robots for older adults in realistic settings.


COVID-19 Literature Topic-Based Search via Hierarchical NMF

Grotheer, Rachel, Huang, Yihuan, Li, Pengyu, Rebrova, Elizaveta, Needell, Deanna, Huang, Longxiu, Kryshchenko, Alona, Li, Xia, Ha, Kyung, Kryshchenko, Oleksandr

arXiv.org Machine Learning

A dataset of COVID-19-related scientific literature is compiled, combining articles from several online libraries and selecting those with open access and full text available. Hierarchical nonnegative matrix factorization is then used to organize literature related to the novel coronavirus into a tree structure that allows researchers to search for relevant literature based on detected topics. We discover eight major latent topics and 52 granular subtopics in the body of literature, related to vaccines, genetic structure and modeling of the disease, patient studies, and related diseases and virology. To make this tool useful to researchers, we created an interactive website that organizes the available literature using this hierarchical structure.
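The tree-building idea can be sketched in a few lines (function names and parameters are illustrative, not from the paper's code): factor the document-term matrix with NMF, assign each document to its dominant topic, and recursively refactor each topic's documents into subtopics.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF X ~ W @ H via Lee-Seung multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank))
    H = rng.random((rank, X.shape[1]))
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return W, H

def topic_tree(X, doc_ids, branch, depth):
    """Recursively split docs-by-terms matrix rows into `branch` topics,
    assigning each document to its dominant topic, until `depth` is reached."""
    if depth == 0 or len(doc_ids) < branch:
        return {"docs": doc_ids}
    W, _ = nmf(X[doc_ids], branch)
    labels = W.argmax(axis=1)   # dominant topic per document
    children = [
        topic_tree(X, [doc_ids[i] for i in range(len(doc_ids)) if labels[i] == t],
                   branch, depth - 1)
        for t in range(branch)
    ]
    return {"docs": doc_ids, "children": children}
```

Because each level partitions the documents by dominant topic, the leaves of the tree together cover the whole corpus, which is what makes topic-based drill-down search possible.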


On Large-Scale Dynamic Topic Modeling with Nonnegative CP Tensor Decomposition

Ahn, Miju, Eikmeier, Nicole, Haddock, Jamie, Kassab, Lara, Kryshchenko, Alona, Leonard, Kathryn, Needell, Deanna, Madushani, R. W. M. A., Sizikova, Elena, Wang, Chuntian

arXiv.org Machine Learning

There is currently an unprecedented demand for large-scale temporal data analysis due to the explosive growth of data. Dynamic topic modeling has been widely used in the social and data sciences with the goal of learning latent topics that emerge, evolve, and fade over time. Previous work on dynamic topic modeling primarily employs the method of nonnegative matrix factorization (NMF), where slices of the data tensor are each factorized into the product of lower-dimensional nonnegative matrices. With this approach, however, information contained in the temporal dimension of the data is often neglected or underutilized. To overcome this issue, we propose instead adopting the method of nonnegative CANDECOMP/PARAFAC (CP) tensor decomposition (NNCPD), where the data tensor is directly decomposed into a minimal sum of outer products of nonnegative vectors, thereby preserving the temporal information. The viability of NNCPD is demonstrated through application to both synthetic and real data, where significantly improved results are obtained compared to those of typical NMF-based methods. The advantages of NNCPD over such approaches are studied and discussed. To the best of our knowledge, this is the first time that NNCPD has been utilized for the purpose of dynamic topic modeling, and our findings will be transformative for both applications and further developments.
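A minimal sketch of nonnegative CP decomposition for a 3-way tensor, using Lee-Seung-style multiplicative updates (this is one standard way to fit NNCPD, not necessarily the solver used in the paper; names are illustrative):

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (J*K, R) from (J, R) and (K, R)."""
    J, R = U.shape
    K = V.shape[0]
    return np.einsum('jr,kr->jkr', U, V).reshape(J * K, R)

def nncpd(X, rank, n_iter=300, eps=1e-9, seed=0):
    """Rank-R nonnegative CP decomposition of a 3-way tensor,
    X[i,j,k] ~ sum_r A[i,r] * B[j,r] * C[k,r], via multiplicative updates."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank))
    B = rng.random((J, rank))
    C = rng.random((K, rank))
    X1 = X.reshape(I, J * K)                     # mode-1 unfolding
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)  # mode-2 unfolding
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)  # mode-3 unfolding
    for _ in range(n_iter):
        A *= (X1 @ khatri_rao(B, C)) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= (X2 @ khatri_rao(A, C)) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= (X3 @ khatri_rao(A, B)) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C
```

For dynamic topic modeling, the three modes would be documents, terms, and time; because the tensor is decomposed directly rather than slice by slice, each rank-one component carries its own temporal profile (the columns of `C`).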


Sabrewing Plans a Cargo Drone That Can Detect and Avoid Obstacles

IEEE Spectrum Robotics

For a pilot, there really is no substitute for knowing what's in front of you. In a drone, that capability is known as detect and avoid, and so far, no drone has cleared the bar. Sabrewing, a startup in Camarillo, Calif., may well be the first to do it. It's working on a cargo-carrying drone that's due to begin test flights in 2020. "Even the military does it only in a kind of rudimentary way, say with a camera system; our system has to provide a way for the aircraft to autonomously avoid obstacles," says Ed De Reyes, the chief executive of Sabrewing.


Meet the Japanese tech guru who is betting big on the future of drones

The Japan Times

The only person in kimono at a recent government meeting on flying cars was Kotaro Chiba, a former online-game executive turned financier of a very specific kind. For Chiba, 44, who wears kimono on special occasions to show his pride in Japanese culture, is gathering money for what he calls the Drone Fund. It invests in unmanned vehicles to survey buildings, make deliveries and take aerial photos for tourist boards; hover scooters; and a pilotless cargo craft that's seeking to make it all the way from Japan to Silicon Valley in one go. Chiba is at the forefront of an industry that's only years away from changing our lives. In five to 10 years, the skies could be alive with drones delivering goods, according to McKinsey & Co.